Improving the Efficiency of Interactive Sequential Pattern Mining by Incremental Pattern Discovery
Abstract
The discovery of sequential patterns, which goes beyond the frequent itemset discovery of association rule mining, has become a challenging task due to its complexity. Essentially, a user specifies a minimum support threshold with respect to the database to find the desired patterns. The mining process is usually iterative, since the user must try various thresholds to obtain a satisfactory result, so the time-consuming process has to be repeated several times. However, current approaches are inadequate for such a process because of the long execution time required for each trial. To minimize both the total execution time and the response time for each trial, we propose a knowledge-base-assisted algorithm for interactive sequence discovery, called KISP. KISP constructs a knowledge base that accumulates the pattern information from individual mining runs, eliminates a considerable number of potential patterns to facilitate efficient support counting, and speeds up the whole process. In addition, we further optimize the algorithm by direct generation of the reduced candidate sets and concurrent counting of variable-sized candidates. For some queries, KISP may eliminate database access completely. The conducted experiments show that KISP outperforms GSP, a state-of-the-art sequence mining algorithm, by several orders of magnitude for interactive
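The idea that a knowledge base can answer some queries without touching the database can be illustrated with a minimal Python sketch. This is not the KISP algorithm itself: the `KnowledgeBase` class, the brute-force `count_supports` helper, the `max_len` cap, and the toy database are all hypothetical names and data introduced here for illustration. The sketch only caches the patterns found at the lowest threshold mined so far, so any later query with an equal or higher threshold is answered by filtering the cache. Unlike KISP as described above, it simply rescans everything when the threshold is lowered, rather than using stored information to reduce the candidate set.

```python
from collections import Counter
from itertools import combinations


def count_supports(database, max_len=3):
    """Brute-force support counting (illustrative only, exponential cost).

    Each sequence is a tuple of items; the support of a pattern is the
    number of sequences that contain it as a subsequence.
    """
    supports = Counter()
    for seq in database:
        seen = set()  # count each pattern at most once per sequence
        for k in range(1, min(max_len, len(seq)) + 1):
            for idx in combinations(range(len(seq)), k):
                seen.add(tuple(seq[i] for i in idx))
        supports.update(seen)
    return supports


class KnowledgeBase:
    """Hypothetical cache of pattern supports from earlier mining runs."""

    def __init__(self, database, max_len=3):
        self.database = database
        self.max_len = max_len
        self.supports = {}        # pattern -> support count
        self.lowest_mined = None  # lowest threshold mined so far
        self.db_scans = 0         # number of times the database was read

    def query(self, min_count):
        """Return all patterns with support >= min_count."""
        if self.lowest_mined is None or min_count < self.lowest_mined:
            # Threshold is lower than anything seen: the database is needed.
            counts = count_supports(self.database, self.max_len)
            self.db_scans += 1
            self.supports = {p: c for p, c in counts.items() if c >= min_count}
            self.lowest_mined = min_count
        # Otherwise the cached patterns already cover the answer.
        return {p: c for p, c in self.supports.items() if c >= min_count}


# Toy database, assumed for illustration.
database = [("a", "b", "c"), ("a", "c"), ("a", "b"), ("b", "c")]
kb = KnowledgeBase(database)
first = kb.query(2)   # mines the database once
second = kb.query(3)  # answered from the knowledge base alone
print(len(first), len(second), kb.db_scans)
```

In this sketch, raising the threshold never triggers a scan, which mirrors the case in which database access is eliminated completely.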
Similar resources
Interactive sequence discovery by incremental mining
Sequential pattern mining has become a challenging task in data mining due to its complexity. Essentially, the mining algorithms discover all the frequent patterns meeting the user-specified minimum support threshold. However, it is very unlikely that the user could obtain satisfactory patterns in just one query. Usually the user must try various support thresholds to mine the database for ...
Supporting Interactive Sequential Pattern Discovery in Databases
One of the most important data mining problems is discovery of sequential patterns. Sequential pattern mining consists in discovering all frequently occurring subsequences in a collection of data sequences. This paper discusses several issues concerning possible extensions to traditional database management systems required to support sequential pattern discovery: a sequential pattern query lan...
Single-pass incremental and interactive mining for weighted frequent patterns
Weighted frequent pattern (WFP) mining is more practical than frequent pattern mining because it can consider different semantic significance (weight) of the items. For this reason, WFP mining becomes an important research issue in data mining and knowledge discovery. However, existing algorithms cannot be applied for incremental and interactive WFP mining and also for stream data mining becaus...
Pure Incremental Approach for Sequential Pattern Mining
In data mining, mining sequential patterns from very large databases is useful in many applications. Most sequential pattern mining algorithms work on static data, meaning the database should not change. But the databases in today's real-world applications do not hold static data; rather, they are incremental databases. New transactions are added at some intervals of time in databa...
Approaches for Pattern Discovery Using Sequential Data Mining
In this chapter we first introduce sequence data. We then discuss the different approaches to mining patterns from sequence data that have been studied in the literature. Apriori-based methods and pattern-growth methods are the earliest and most influential methods for sequential pattern mining. There is also a vertical-format-based method, which works on a dual representation of the sequence database. Wo...
Publication date: 2003